[DNM] Zephyr update to 6a0b1da158a + WA to 8764 #8804

Closed

@kv2019i kv2019i commented Jan 26, 2024

This is #8764 plus a SOF-side workaround (WA) to fix the failures on cAVS2.5 (other attempts to solve this issue have failed this week).

The WA works in local testing: during a 1h test, the failing condition was hit twice, and in both cases the WA recovered. Let's see how this fares in CI with wider testing.

This changes the secondary core power up routine to use the newly
introduced k_smp_cpu_start() and k_smp_cpu_resume(). This removes
the need to mirror part of the SMP start-up code from Zephyr, and
SOF no longer needs to call into Zephyr private kernel code.

West update includes:

eefaeee061c8 kernel: smp: introduce k_smp_cpu_resume
042cb6ac4e00 soc: intel_adsp: enable DfTTS-based time stamping
             on ACE platforms
6a0b1da158a4 soc: intel_adsp: call framework callback function for restore
e7217925c93e ace: use a 'switch' statement in pm_state_set()
c99a604bbf2c ace: remove superfluous variable initialisation
a0ac2faf9bde intel_adsp: ace: enable power domain
4204ca9bcb3f ace: fix DSP panic during startup  (fixes c3a6274bf5e4)
d4b0273ab0c4 cmake: sparse.template: add COMMAND_ERROR_IS_FATAL
ca12fd13c6d3 xtensa: intel_adsp: fix a cache handling error
0ee1e28a2f5f xtensa: polish doxygen and add to missing doc
035c8d8ceb4b xtensa: remove sys_define_gpr_with_alias()
a64eec6aaeec xtensa: remove XTENSA_ERR_NORET

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Signed-off-by: Rander Wang <rander.wang@intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Starting with Zephyr commit e021ccfc745221c6 ("drivers: dma:
intel-adsp-hda: add delay to stop host dma"), the pause-resume
sof-test cases started failing on Intel cAVS2.5 platforms.

Add a delay loop around the DMA stop code in chain DMA to work around
the issue while a proper fix is under investigation. This allows us
to resume integrating newer Zephyr versions into SOF and ensures
we detect any new regressions in time.

Link: thesofproject#8792
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
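
For readers following along, the "delay loop around DMA stop" idea from the commit message above can be sketched roughly as below. This is not the code from this PR: `mock_dma_chan`, `mock_dma_stop()`, `mock_delay()` and `stop_with_retry()` are hypothetical stand-ins written for illustration only, and the real driver types, delay length and logging calls in SOF/Zephyr will differ. The only parts taken from the PR are the `-EBUSY` check and the idea of a bounded retry budget.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a DMA channel; not the SOF or Zephyr type. */
struct mock_dma_chan {
	int busy_polls_left;	/* stop attempts that will still report busy */
	int stop_calls;		/* how many times mock_dma_stop() was invoked */
};

/* Hypothetical stand-in for dma_stop(): returns -EBUSY until the channel drains. */
static int mock_dma_stop(struct mock_dma_chan *chan)
{
	chan->stop_calls++;
	if (chan->busy_polls_left > 0) {
		chan->busy_polls_left--;
		return -EBUSY;
	}
	return 0;
}

/* Hypothetical delay hook; real firmware would busy-wait or sleep here. */
static void mock_delay(void)
{
}

/*
 * Try to stop the channel. If the first attempt reports -EBUSY, retry up
 * to max_retries more times with a delay between attempts.
 * Returns 0 on success, or the last error code otherwise.
 */
static int stop_with_retry(struct mock_dma_chan *chan, int max_retries)
{
	int err = mock_dma_stop(chan);

	if (err == -EBUSY) {
		int retries = max_retries;

		fprintf(stderr, "dma stop busy, retrying...\n");
		while (err == -EBUSY && retries-- > 0) {
			mock_delay();
			err = mock_dma_stop(chan);
		}
	}

	return err;
}
```

Bounding the loop matters here: if the channel never leaves the busy state, the caller still gets `-EBUSY` back after the retry budget is spent instead of spinning forever, so the failure remains visible in test logs.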
if (err == -EBUSY) {
	int retries = 100;

	comp_warn(dev, "dma_stop() fail chan_index = %u, retrying...",

Suggested change
- comp_warn(dev, "dma_stop() fail chan_index = %u, retrying...",
+ comp_err(dev, "FIXME 8686: dma_stop() fail chan_index = %u, retrying...",

@kv2019i kv2019i marked this pull request as ready for review January 26, 2024 19:20

kv2019i commented Jan 26, 2024

Hmm, this is the best attempt so far. We have two failures on pause-resume:
https://sof-ci.01.org/sofpr/PR8804/build2377/devicetest/index.html?model=MTLP_RVP_NOCODEC&testcase=multiple-pause-resume-50
https://sof-ci.01.org/sofpr/PR8804/build2378/devicetest/index.html

Both are with the nocodec tplg (= multicore) and in the same test (pause-resume). To complicate matters, I cannot get this same error on a local machine. The same error occurred with the previous attempt (at https://sof-ci.01.org/sofpr/PR8764/build2194/devicetest/index.html for #8764), so this seems to be repeatable in CI.


marc-hb commented Jan 26, 2024

> To complicate matters, I cannot get this same error on a local machine.

One of the typical inconsistencies I've noticed recently: sof_debug seems wildly different across systems.

On a more positive note, all kernel parameters are reported at the start of each run. Example: https://sof-ci.01.org/sofpr/PR8804/build2377/devicetest/index.html?model=MTLP_RVP_HDA&testcase=verify-kernel-boot-log

@kv2019i kv2019i marked this pull request as draft January 26, 2024 22:47
@kv2019i kv2019i changed the title Zephyr update to 6a0b1da158a + WA to 8764 [DNM] Zephyr update to 6a0b1da158a + WA to 8764 Jan 26, 2024

kv2019i commented Jan 26, 2024

Marking this one as draft. #8805 seems to have better test results, so we should probably merge that one and take more time to investigate the failures seen with this one. FYI @RanderWang @dcpleung
